Autotuning Numerical Dense Linear Algebra for Batched Computation With GPU Hardware Accelerators

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Autotuning dense linear algebra libraries on GPUs

As GPUs are quickly evolving in complexity, tuning numerical libraries for them is becoming more challenging. We present an autotuning approach in the area of dense linear algebra (DLA) libraries for GPUs. The MAGMA library is used to demonstrate the techniques and their effect on performance and portability across hardware systems. We show that, figuratively speaking, our autotuning approach f...

متن کامل

Accelerating GPU Kernels for Dense Linear Algebra

Implementations of the Basic Linear Algebra Subprograms (BLAS) interface are major building block of dense linear algebra (DLA) libraries, and therefore have to be highly optimized. We present some techniques and implementations that significantly accelerate the corresponding routines from currently available libraries for GPUs. In particular, Pointer Redirecting – a set of GPU specific optimiz...

متن کامل

Optimizing Memory-Bound Numerical Kernels on GPU Hardware Accelerators

Hardware accelerators are becoming ubiquitous high performance scientific computing. They are capable of delivering an unprecedented level of concurrent execution contexts. High-level programming languages (e.g., CUDA), profiling tools (e.g., PAPI-CUDA, CUDA Profiler) are paramount to improve productivity, while effectively exploiting the underlying hardware. We present an optimized numerical k...

متن کامل

Scalable Dense Linear Algebra on Heterogeneous Hardware

Design of systems exceeding 1 Pflop/s and the push toward 1 Eflop/s, forced a dramatic shift in hardware design. Various physical and engineering constraints resulted in introduction of massive parallelism and functional hybridization with the use of accelerator units. This paradigm change brings about a serious challenge for application developers, as the management of multicore proliferation ...

متن کامل

Towards dense linear algebra for hybrid GPU accelerated manycore systems

0167-8191/$ see front matter 2010 Elsevier B.V doi:10.1016/j.parco.2009.12.005 * Corresponding author. Tel.: +1 865 974 8295; fa E-mail addresses: [email protected] (S. Tomov We highlight the trends leading to the increased appeal of using hybrid multicore + GPU systems for high performance computing. We present a set of techniques that can be used to develop efficient dense linear algebra alg...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Proceedings of the IEEE

سال: 2018

ISSN: 0018-9219,1558-2256

DOI: 10.1109/jproc.2018.2868961